Abstract

Social  media  will  play  a  key  role  in  many  areas  of  intelligence  operations  with  the  development 
of  knowledge  extraction  analytics.  The  data  contained  within  these  social  networking  services 
present many challenges, but the value obtained from detecting subtle and hidden information
exchanges and environmental observations is vast. The analytics that will drive social media
knowledge  exploitation  include  advanced  text  processing  technologies  allowing  multi-faceted 
concept  or  frame-based  queries  within  a  familiar  search  engine  interface  for  users  who  are  not 
expert  analysts  or  who  may  be  operating  with  limited  intelligence  resources  in  disadvantaged 
information  collection  situations.  In  this  paper,  we  demonstrate  constructing  frame-based 
semantic  search  queries  using  models  of  concepts  and  roles  extracted  from  social  media  content. 
This  approach  to  searching  allows  intuitive  query  formulations  requiring  less  up-front  knowledge 
of  the  search  parameters.  We  demonstrate  this  semantic  search  capability  by  querying  over 
seven  million  tweets  captured  during  the  Arab  Spring  uprisings. 
Introduction

Social  media  is  steadily  growing  in  importance  as  an  information  source.  The  brevity  of  social  media 
content  (e.g.,  140  characters  per  tweet)  combined  with  the  increasing  usage  of  mobile  devices  facilitates 
the  sharing  of  information  from  anywhere  at  any  time.  This  phenomenon  is  giving  rise  to  ever  stronger 
citizen authorship, where millions of social media users share, discuss, comment on, and illustrate
recent  happenings  on  a  daily  basis.  The  unprecedented  volumes  of  real-time  information  contributed  by 
large  communities  of  social  media  users  can  be  researched  and  searched  for  sources  of  information 
relevant  to  national  security. 

There  is  increasing  acceptance  within  the  intelligence  community  of  the  potential  value  of  social  media, 
but  understandably  some  reluctance  to  rank  it  alongside  more  traditional  clandestine  sources.  There  is  no 
denying  that  much  of  the  information  available  from  platforms  such  as  Twitter  and  Facebook  is  of  little 
use  from  an  intelligence  perspective.  But  there  is  still  a  great  deal  of  useful  and  actionable  information 
available  and  the  challenge  for  the  intelligence  community  is  not  only  to  learn  how  to  extract  that 
information  using  new  tools  and  techniques,  but  how  to  analyze  it  as  a  unique  and  dynamic  data  source. 

The  value  of  social  media  content  as  an  intelligence  source  comes  to  the  forefront  when  traditional 
methods  of  intelligence  collection  such  as  airborne  sensors,  photographic  devices,  infrared  detection 
systems,  listening  devices,  and  motion  detectors  are  unavailable,  for  instance,  within  a  constrained 
resource  allocation  environment.  Also,  there  can  be  challenges  and  risks  involved  in  direct  intelligence 
collection  resulting  from  unsafe  conditions  due  to  military  presence  or  conflicts  in  progress. 
Consequently,  intelligence  collection  efforts  must  sometimes  rely  on  the  eyes  and  ears  of  the  general 
population.  Members  of  the  public  go  about  their  daily  business  observing  actions  or  overhearing 
information  that  may  be  of  immeasurable  value  to  security  forces.  With  easier  and  more  widespread 
access  to  the  Internet,  the  general  population  can  share  their  observations  and  information  through 
communications  activities  enabled  by  social  media  services.  There  are  many  social  media  services  to 
choose from: Yahoo Groups, Google Groups, wikis, user forums, Ning, LinkedIn groups, blogs, Facebook,
and  Twitter — literally,  hundreds  of  ways  to  share  information. 

While  information  collected  from  social  media  content  may  not  provide  the  information  superiority  of 
high-grade  intelligence,  this  citizen-authored  information  could  complement  various  types  of  high-grade 
intelligence.  Social  media  sources  can  provide  large  amounts  of  low-grade  information  that  when  added 
together,  provide  a  picture  of  the  political,  military,  sociological,  and  infrastructure  aspects  of  an 
operating  environment.  The  collection  of  low-grade  intelligence  is  an  approach  attributed  to  the  famous 
British  counterinsurgency  expert  General  Frank  Kitson  (Kitson,  1971). 

According  to  a  NextGov  article  (NextGov,  2012)  summarizing  a  panel  discussion  of  open-source 
intelligence hosted by Government Executive, there is potential for social media to take the place of
clandestine  sources  when  it  comes  to  intelligence  analysis.  While  this  may  be  overly  optimistic,  and  not 
necessarily  the  desired  end  goal,  there  is  certainly  scope  for  social  media  to  provide  useful  insights  that 
can  either  inform  or  support  information  gathered  from  traditional  intelligence  sources.  Referring  to  the 
NextGov article, Centrifuge Systems (www.centrifugesystems.com/news) noted that social media could
help  predict  large  scale  shifts  in  the  cultural,  political,  and  social  world.  An  example  would  be  the  impact 
of  social  media  on  shaping  geopolitical  development  during  the  recent  Arab  Spring  and  current  Syrian 
conflict. 

This  paper  introduces  a  new  technology  for  extracting  low-grade  intelligence  information  from  large 
streams  of  social  media  communications.  The  next  section  overviews  this  frame-based  semantic  search 
technology  within  a  suite  of  advanced  analytics  called  Contour.  Contour,  developed  by  Decisive 
Analytics  Corporation  (DAC,  http://www.dac.us/),  offers  a  range  of  capabilities  to  facilitate  the  search  of 
social media content. The frame-based ontology used by Contour, called FrameNet, is discussed. The
third  section  demonstrates  Contour’s  multi-level  semantic  search  formulation  of  Concepts,  Roles,  and 
Keywords  in  the  context  of  a  large  collection  of  tweets  from  the  Arab  Spring  uprisings.  Example  search 
queries  illustrate  how  this  new  semantic  search  technology  can  support  an  analyst’s  situational  awareness, 
firstly,  as  to  what  is  occurring  within  a  specific  geographical  area,  and  secondly,  how  this  may  affect 
ongoing missions. The final section discusses the advantages Contour’s semantic search capability offers in
the context of disadvantaged intelligence collection, as well as areas needing future development.

Frame-based  Semantic  Search 

Text in social media communications is considered “unstructured”: it is shortened, informally written,
and interspersed with hashtags, URLs, replies, mentions, and emoticons. In
addition  to  its  unstructured  nature,  the  sheer  volume  of  social  media  communications  sent  during  events 
of  interest  is  overwhelming  and  hence  difficult  to  distill  for  relevant  information.  In  order  to  utilize  social 
media  communications  for  low-grade  intelligence,  analysts  must  rely  on  automated  text  processing  tools 
to  filter  through  and  organize  these  large  bodies  of  text  into  a  rapidly-digestible  form.  Contour’s  frame- 
based  semantic  modeling  is  one  approach  to  addressing  the  text  processing  challenges  associated  with  a 
large  corpus  of  unstructured  text. 

The Contour platform imports unstructured text from a variety of sources and then maps the text to an
existing ontology of frames (FrameNet, https://framenet.icsi.berkeley.edu/fndrupal/) during a process of
Semantic Role Labeling (SRL). FrameNet is a structured language model grounded in the theory of
Frame Semantics by Charles Fillmore and colleagues (1976). Fillmore’s hypothesis is that people understand
things by performing mental operations on what they already know, and that such knowledge is describable in
terms of information packets called frames.

Frame  semantic  analysis  is  a  technique  linguists  use  to  characterize  words  by  their  interactions  with  the 
words  around  them.  The  goal  of  frame-based  text  analysis  is  to  identify  the  words  that  evoke  specific 
concepts  in  the  reader’s  mind  (frames)  and  to  map  words  and  phrases  from  the  text  to  the  roles  that  are 
related  to  these  concepts.  A  simple  example  of  the  frame  concept  can  be  seen  in  this  sentence: 

John  bought  food. 

The  verb  “bought”  in  this  sentence  brings  to  mind,  or  evokes  in  the  reader’s  mind,  a  scenario  in  which 
goods  are  exchanged  for  money.  Using  the  preexisting  understanding  of  a  commercial  transaction  (the 
schema  for  the  commercial  transaction  “frame”)  the  reader  understands  that  John  plays  the  role  of  the 
buyer  and  that  goods  were  exchanged.  The  prototypical  commercial  transaction  scenario  acts  as  a  frame 
that  places  phrases  in  the  sentence  “John  bought  food”  into  their  semantic  roles.  Under  the  frame-based 
model  of  semantics,  understanding  language  involves  accessing  the  semantic  frames  evoked  by  the  text 
and  associating  groups  of  words  with  the  roles  defined  by  the  frames. 

The FrameNet lexical database of English is both human- and machine-readable and is built by annotating
examples of how words are used in actual text (Ruppenhofer, Ellsworth, Petruck, Johnson, & Scheffczyk,
2006). The FrameNet database has been under development at the International Computer Science Institute in
Berkeley since 1997, supported primarily by the National Science Foundation. It consists of over 1,000
semantic frames, their associated Frame Elements (FEs), and the words that evoke them. For example,
Table 1 shows the Operate_vehicle frame, prototypical of a scenario in which someone operates a type of
vehicle.
Frame Element (FE)    Description
Area                  General area in which motion takes place
Driver                The being that controls the vehicle as it moves
Goal                  Where the moving object ends up
Path                  Description of the trajectory of motion
Source                Expression that defines a starting point of motion
Vehicle               The means of conveyance controlled by the driver

Table 1. Several of the Frame Elements (FEs) of the Operate_vehicle frame. FEs are the semantic
components of the concept of operating a vehicle.

For  each  of  FrameNet’s  frame  definitions,  the  database  defines  a  number  of  words  that,  when  used  in  the 
right  context,  evoke  that  frame.  For  example,  some  of  the  words  that  evoke  the  Operate_vehicle  frame 
are:  drive,  fly,  ride,  row,  and  taxi. 
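
The relationship between a frame, its Frame Elements, and the words that evoke it can be pictured as a small lookup structure. The Python sketch below is illustrative only: the Frame class and the frames_evoked_by helper are hypothetical names, not part of FrameNet's or Contour's software, and the entries are limited to the Operate_vehicle details given in Table 1 and the paragraph above.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Simplified, hypothetical stand-in for one FrameNet frame."""
    name: str
    elements: dict = field(default_factory=dict)  # FE name -> description
    lexical_units: frozenset = frozenset()        # words that can evoke the frame


OPERATE_VEHICLE = Frame(
    name="Operate_vehicle",
    elements={
        "Area": "General area in which motion takes place",
        "Driver": "The being that controls the vehicle as it moves",
        "Goal": "Where the moving object ends up",
        "Path": "Description of the trajectory of motion",
        "Source": "Expression that defines a starting point of motion",
        "Vehicle": "The means of conveyance controlled by the driver",
    },
    lexical_units=frozenset({"drive", "fly", "ride", "row", "taxi"}),
)


def frames_evoked_by(word, frames):
    """Return the names of frames whose lexical units include `word`.

    This is an exact, case-insensitive match only. A real system would also
    lemmatize ("driven" -> "drive") and check that the context fits the frame.
    """
    w = word.lower()
    return [f.name for f in frames if w in f.lexical_units]


if __name__ == "__main__":
    print(frames_evoked_by("drive", [OPERATE_VEHICLE]))
    print(frames_evoked_by("walk", [OPERATE_VEHICLE]))
```

Because the lookup ignores context, it would over-trigger on words such as "fly" used as a noun; that gap is what the Semantic Role Labeling step described next is meant to close.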

Contour’s  semantic  search  capability  employs  Semantic  Role  Labeling  (SRL)  (Gildea  &  Jurafsky,  2002) 
to  automatically  map  words  and  phrases  from  the  source  text  into  the  elements  of  the  frames  that  are 
evoked  by  the  ideas  in  that  text.  Contour’s  SRL  algorithm  uses  a  supervised  training  technique  to  build  a 
probabilistic  model  that  maps  constituents  in  Context  Free  Grammar  (CFG)  parse  trees  to  elements  in  the 
frames  evoked  by  the  text.  This  FrameNet-based  SRL  process  builds  a  rich,  formal  model  of  meaning 
from  the  target  text.  Figure  1  shows  the  result  of  FrameNet-based  SRL  performed  on  a  simple  target 
sentence.  The  data  structure  that  contains  the  mapping  between  frame  concepts  and  the  target  text  (portion 
below  the  arrow  in  Figure  1)  is  called  the  semantic  model  of  the  text.  The  semantic  model  of  a  target 
document  acts  as  a  high-fidelity  representation  of  the  text’s  meaning. 


[Figure 1 diagram. Source Text: “The unit found two men hiding in the trunk of a car driven across the border.”
Semantic Role Labeling (SRL) identifies three frames. In the resulting Semantic Representation, each frame
carries Frame Metadata (e.g., Operate_vehicle) and Element Metadata marking the text that fills its
elements (e.g., Hidden_object and Hiding_place; Vehicle and Path).]

Figure 1. A FrameNet-based Semantic Role Labeling process uncovers three frames in the target
sentence: Hiding_objects (evoked from the word hiding), Locating (evoked from the word found),
and Operate_vehicle (evoked from the word driven). Text from the sentence that fulfills the role of
the elements for each frame is labeled with the name of its element. The data structure that maps
target text to frames and frame elements is called the semantic representation.
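
One way to picture the semantic representation described in Figure 1 is as a list of frame instances, each recording the evoking word and the text spans that fill its elements. The Python sketch below is an illustration under stated assumptions, not Contour's actual data structure: the class and function names are hypothetical, the Hiding_objects and Operate_vehicle element names follow the figure's labels, the span assignments are one plausible reading of the sentence, and the Locating element names (Perceiver, Sought_entity) are assumed.

```python
from dataclasses import dataclass, field

SENTENCE = "The unit found two men hiding in the trunk of a car driven across the border."


@dataclass
class FrameInstance:
    frame: str   # frame name
    target: str  # word in the sentence that evokes the frame
    elements: dict = field(default_factory=dict)  # element name -> text span


# Hand-built for illustration; an SRL system would produce this automatically.
SEMANTIC_MODEL = [
    FrameInstance("Locating", "found",
                  {"Perceiver": "The unit", "Sought_entity": "two men"}),  # assumed FE names
    FrameInstance("Hiding_objects", "hiding",
                  {"Hidden_object": "two men", "Hiding_place": "in the trunk of a car"}),
    FrameInstance("Operate_vehicle", "driven",
                  {"Vehicle": "a car", "Path": "across the border"}),
]


def spans_are_grounded(sentence, model):
    """Check that every target and role filler is literal text from the sentence."""
    return all(
        inst.target in sentence and all(span in sentence for span in inst.elements.values())
        for inst in model
    )


def roles_filled_by(span, model):
    """List the (frame, element) pairs that a given text span fills."""
    return [
        (inst.frame, role)
        for inst in model
        for role, text in inst.elements.items()
        if text == span
    ]


if __name__ == "__main__":
    print(spans_are_grounded(SENTENCE, SEMANTIC_MODEL))
    print(roles_filled_by("two men", SEMANTIC_MODEL))
```

The same span can fill roles in more than one frame ("two men" is both what was located and what was hidden), which is why the model is kept per frame instance rather than as a single label per word.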
In Contour, the FrameNet-based model of the text can be used to construct queries through a series of
filtering  widgets,  or  to  formulate  a  more  targeted  query  in  a  Semantic  Search  box.  The  filtering  widgets 
and  Semantic  Search  box  are  located  along  the  left  side  of  the  main  window  of  the  user  interface  (Figure 
2).  The  large  subwindow  to  the  right  of  the  filtering  and  search  widgets  displays  the  search  results. 


[Figure 2 screenshot. Search box: *:*. Source filter: importsource:"egypt unrest twitter". Showing results 376 to 400 of 6,238,231.
Concept widget (first page, tweet counts): Performers_and_roles (354335), Locative_relation (323994), Goal (273861), Partitive (207422), Temporal_collocation (183764), Degree (119230), Calendric_unit (117486), Taking_sides (115732), Origin (114954), Quantified_mass (97192).
Role widget (first page, tweet counts): Ground (280170), Entity (271007), Group (211720), Subset (187011), Figure (166589), Agent (161146), Performer (145019), Cognizer (141291), Person (137397), Unit (135779).
Other widgets: Content Type, Import Source. Highlight legend: Concept text, Role text, Both (concept and role).]


Figure 2. The main window of the user interface to Contour’s Semantic Search technology.
Filtering widgets and a Semantic Search box are along the left side of the screen. Search results
are displayed in the large subwindow on the right and center within the main window.

Figure  3  is  an  enlarged  composite  picture  of  the  FrameNet-inspired  portions  of  the  user  interface  with  the 
frame-based filtering widgets (left side) and the Semantic Search box (upper right side). On the left side
of Figure 3, the FrameNet frame Killing has been selected in the Concept filtering widget. The Concept
Killing would identify sentences containing words such as assassinate, bloodshed, destroy, fatality,
massacre, murder, shooting, and terminate. After selecting the Killing Concept, the Role filtering widget
will  show  the  decomposition  of  the  Killing  frame  into  its  frame  elements  along  with  the  number  of  tweets 
where  that  Concept  and  Role  are  found.  For  example,  using  the  Killing  Concept  and  Victim  Role  filters, 
4697  tweets  were  found.  On  the  right  side  of  Figure  3,  the  Semantic  Search  box  can  be  used  for  a  targeted 
search  of  the  Killing  frame  (in  the  Concept  text  box)  with  frame  element  Place  (in  the  Role  text  box)  and 
a  Keyword  with  wildcard  characters  *Cairo*.  This  search  will  return  tweets  fitting  the  Killing  frame  with 
the  frame  element  of  Place  matching  Cairo.  Two  example  tweets  returned  by  this  query  are  shown  on  the 
lower  right  side  of  Figure  3  where  the  blue  highlighted  text  is  the  Concept  text,  green  text  is  the  Role  text, 
and  the  yellow  text  is  both  the  Concept  and  the  Role. 
1 of 24. Demonstrators take part in a candlelight

CNT-APP6- L 22 2.5 - Wednesday, 2/6/2013 3:22pm - Unknown

Last week, there was bloodshed in Cairo when Mubarak loyalists in plain clothes


FLASH: One protester

CNT-APP10-20216 - Thursday, 2/7/2013 3:59pm - English

One protester killed in shooting in Cairo's Tahrir Square

Figure 3. The Concept and Role widgets with the Killing frame selected and associated frame
elements displayed in the Role widget (left side). The Semantic Search box containing the Killing
Concept, Place Role, and a Keyword representing a specific location *Cairo* (upper right side).
Two example tweets resulting from the search criteria selected in the Semantic Search box (lower
right side).
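
The targeted query in the Semantic Search box (Concept, Role, and a wildcard Keyword) can be sketched as a filter over role-labeled tweets. The Python below is a minimal illustration, not Contour's implementation: RoleFill, Tweet, and semantic_search are hypothetical names, the role annotations are written by hand rather than produced by SRL, the third tweet is invented as a non-matching case, and the standard-library fnmatch module stands in for whatever wildcard matching Contour uses.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class RoleFill:
    concept: str  # frame name, e.g. "Killing"
    role: str     # frame element name, e.g. "Place"
    text: str     # words of the tweet that fill the role


@dataclass
class Tweet:
    text: str
    fills: list = field(default_factory=list)


def semantic_search(tweets, concept, role, keyword):
    """Return tweets where `keyword` (with * wildcards) matches the text
    filling `role` of `concept`. Matching is case-insensitive."""
    pattern = keyword.lower()
    return [
        t for t in tweets
        if any(
            f.concept == concept
            and f.role == role
            and fnmatchcase(f.text.lower(), pattern)
            for f in t.fills
        )
    ]


TWEETS = [
    Tweet("One protester killed in shooting in Cairo's Tahrir Square",
          [RoleFill("Killing", "Victim", "One protester"),
           RoleFill("Killing", "Place", "in Cairo's Tahrir Square")]),
    Tweet("Last week, there was bloodshed in Cairo when Mubarak loyalists in plain clothes",
          [RoleFill("Killing", "Place", "in Cairo")]),
    # Invented example: mentions Cairo, but only as the Place of a Protest frame.
    Tweet("Protesters gather in Cairo",
          [RoleFill("Protest", "Place", "in Cairo")]),
]

if __name__ == "__main__":
    for hit in semantic_search(TWEETS, "Killing", "Place", "*Cairo*"):
        print(hit.text)
```

Restricting the keyword to the role filler, rather than the whole tweet, is what separates this query from a plain keyword search: the third tweet contains "Cairo" but is not returned.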

With Contour’s frame-based filtering widgets, the more targeted Semantic Search box, or both, a large
volume of social media communications can be queried efficiently, filtering out most of the
non-informative text so that low-grade intelligence information can be collected. The next section
examines  this  frame-based  semantic  search  approach  in  more  detail  by  querying  a  Twitter  dataset  of  over 
seven  million  tweets  collected  during  the  height  of  the  Arab  Spring  protests. 

Use  Case:  Searching  Arab  Spring  Tweets 

Twitter’s  ever  growing  daily  flow  of  millions  of  tweets  includes  all  kinds  of  information.  Tweets  are  short 
messages  of  up  to  140  characters.  Tweets  can  provide  a  set  of  unique  perspectives  reflecting  the  points  of 
view of users who are interested in or participating in an event. In unplanned events, Twitter users
sometimes  spread  news  prior  to  the  traditional  news  media  (Kwak,  Lee,  Park,  &  Moon,  2010;  Sakaki, 
Okazaki,  &  Matsuo,  2010).  For  planned  events,  Twitter  users  often  post  messages  in  anticipation  of  the 
event.  In  this  manner,  Twitter  users  take  on  the  role  of  social  sensors  (Sakaki,  Okazaki,  &  Matsuo,  2010) 
with  the  network  serving  as  a  channel  for  breaking  news  alerts  and  subsequently  streaming  real-time  data 
as events unfold. For example, during the Arab Spring, Twitter connected Western and Arab individuals to
protest participants, bringing unique and unfiltered content. There was also evidence that foreign governments
directly  monitored  social  media  sites  to  supplement  their  limited  knowledge  of  what  was  actually 
occurring  during  the  Arab  Spring  protests  (Becker,  Naaman,  &  Gravano,  2011). 

From  an  analysis  perspective,  identifying  low-grade  intelligence  on  Twitter  is  a  challenging  problem  due 
to the diversity of message content and writing style, as well as the immense scale of the data (i.e., half a
billion tweets a day; Costolo, 2012). Twitter users post messages with a variety of content types including
personal updates, calls for participation, warnings about violence and threatening situations, and various
other bits of information (Naaman, Boase, & Lai, 2010). In addition, Twitter messages, by design,
contain little textual information and are often written informally with poor spelling, grammar and
sentence  structure  (Becker,  Naaman,  &  Gravano,  2011).  To  address  these  challenges,  new  text  analytic 
approaches  such  as  Contour’s  frame-based  semantic  search  capability  are  needed  for  extracting  low-grade 
intelligence  from  social  media  message  streams. 

For  demonstration  purposes,  a  large  set  of  tweets  collected  during  the  Arab  Spring  protest  was  imported 
into Contour’s semantic search engine. The tweet set consists of approximately 7.3 million tweets
collected between February 1 and February 19, 2011 by the Blender Cross-source Information Extraction
Laboratory at the City University of New York (CUNY) Graduate Center under the direction of Dr. Heng
Ji (Zubiaga, Ji, & Knight, 2013). This date range covers the heart of the rebellion from a few days after
the  first  coordinated  mass  protests  against  Mubarak  on  January  25,  including  Mubarak’s  resignation  on 
February  11,  and  the  rejection  of  a  power  transfer  to  civilian  administration  by  the  Egyptian  military  on 
February  13. 

We  have  developed  a  simplistic  use  case  simulating  a  low-grade  intelligence  collection  effort  involving 
Contour’s  semantic  search  capabilities  applied  to  the  Arab  Spring  tweet  dataset.  The  simulated  user  is  an 
analyst  requiring  information  about  events  that  may  be  brewing  nearby  or  within  an  area  of  operations. 
This  use  case  will  employ  several  different  types  of  semantic  searches  performed  on  the  tweets:  a  Concept 
search  which  is  the  broadest  type  of  search  returning  instances  of  the  selected  Concept  (FrameNet  frame); 
a  combined  Concept  and  Role  search  returning  instances  of  the  Concept  and  Role  (FrameNet  frame  and 
frame  element);  and  the  most  targeted  type  of  search,  a  Concept,  Role,  and  Keyword  search  identifying 
the  specified  Keyword  within  the  semantic  role.  For  these  searches  either  the  filtering  widgets  or  the 
Semantic  Search  box  can  be  used  to  formulate  queries. 

The  use  case  begins  as  an  analyst  receives  an  external  report  about  a  protest  occurring  in  Egypt  possibly 
near  a  Forward  Operating  Base  (FOB).  A  captured  stream  of  tweets  is  imported  into  Contour’s  semantic 
search  engine  using  the  Import  Source  and  Content  Type  widgets.  After  import,  a  broad  semantic  search 
is  performed  using  Contour’s  Concept  filtering  widget  by  selecting  the  Protest  frame  (Figure  4).  The 
Protest  Concept  filters  the  7,397,115  total  tweets  down  to  73,328  tweets  relevant  to  the  Protest  frame. 
Contour’s  results  window  displays  25  tweets  per  page.  Figure  5  shows  three  example  tweets  from  the  first 
page  of  results  using  the  Protest  filter. 
Figure 4. The Concept filtering widget with the Protest frame selected (top). Once the Concept is
selected  the  Role  filtering  widget  displays  that  frame’s  elements  and  the  number  of  tweets 
containing  the  specific  Concept  and  Role  in  parentheses  (bottom). 
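
The per-Role counts shown in the Role widget amount to a faceted count over the tweets that match the selected Concept. The Python sketch below shows one way such counts could be computed; it is not Contour's implementation, the role_facets name is hypothetical, and the annotations are toy data invented for the example.

```python
from collections import Counter

# Toy, hand-made annotations as (tweet_id, concept, role). Not real data.
ANNOTATIONS = [
    (1, "Protest", "Place"),
    (1, "Protest", "Degree"),
    (2, "Protest", "Place"),
    (2, "Protest", "Place"),   # a role repeated within one tweet counts once
    (3, "Killing", "Victim"),
    (4, "Protest", "Action"),
]


def role_facets(annotations, concept):
    """Count distinct tweets per Role among annotations of the chosen Concept."""
    seen = {(tweet_id, role) for tweet_id, c, role in annotations if c == concept}
    return Counter(role for _, role in seen)


if __name__ == "__main__":
    for role, count in sorted(role_facets(ANNOTATIONS, "Protest").items()):
        print(f"{role} ({count})")
```

Counting distinct tweets rather than raw annotations keeps the number beside each Role equal to the size of the result list the analyst would see after selecting it.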


Revolution 'Two Million' Turn Out in Cairo Could t (Protest)

CNT-APP2-35435 - Friday, 2/8/2013 4:14pm - Unknown

Al Jazeera says as many as two million people turned out in Cairo Tuesday to protest Mubarak.


Figure  5.  Example  tweets  resulting  from  a  search  using  the  Concept  filtering  widget  with  the 
Protest  Concept  selected. 

A  quick  glance  at  the  first  page  of  tweets  informs  the  analyst  that  an  Egyptian  protest  was  triggered  by  the 
fall  of  leader  Zine  El-Abidine  Ben  Ali  in  Tunisia.  The  Egyptians  appear  to  be  protesting  against  rising 
prices,  poverty,  unemployment,  and  Mubarak’s  authoritarian  regime.  The  protests  are  occurring 
somewhere  in  Cairo. 

There  are  four  frame  elements  associated  with  the  Protest  frame  (Place,  Degree,  Action,  Descriptor). 
Selecting  the  Place  Role  will  give  more  information  about  the  location  of  the  protests.  The  analyst  uses 
the  Place  Role  filter  reducing  the  number  of  relevant  tweets  to  7730.  Figure  6  shows  three  example  tweets 
from  the  first  page  of  search  results  using  the  Protest  Concept  and  Place  Role  filtering. 
Egypt: Political Energy Powers Exhausted Protesters (Protest)

CNT-APP10-45175 - Friday, 2/8/2013 4:04pm - Unknown

Yet despite these fears of economic uncertainty, protesters at Tahrir remain steadfast, calling for million-strong protests every Sunday, Tuesday and
Friday until their demands for the president's ouster are met.


Figure  6.  Example  tweets  resulting  from  a  search  using  the  Concept  filtering  widget  with  the 
Protest  Concept  and  the  Role  filtering  widget  with  the  Place  Role  selected. 

From  these  tweets  the  analyst  identifies  the  primary  location  of  the  protests — Tahrir  Square.  By 
descriptions  such  as  “sprawling”  and  “more  than  100,000”  contained  in  the  tweets,  the  protests  appear 
large  in  size  involving  many  citizens.  One  of  the  tweets  refers  to  “tanks,  armored  personnel  carriers,  and 
soldiers” indicating Egyptian government forces may be taking action against the protestors. Cairo’s local
roadways  are  being  affected  by  the  protest  activities.  The  analyst  believes  this  may  have  an  impact  on 
planned  patrol  routes  within  the  city  and  especially  near  the  Tahrir  Square  area. 

To  verify  the  extensiveness  of  the  protests  the  analyst  selects  the  Degree  Role  filter  reducing  the  number 
of  relevant  tweets  to  5860  and  confirming  the  protests  are  widespread.  Figure  7  shows  two  example 
tweets  resulting  from  the  Protest  and  Degree  filtering. 


(Protest) Degree

Police and demonstrators fought running battles on the streets of Cairo on Friday in a fourth day of unprecedented protests by tens of thousands of
Egyptians demanding an end to President Hosni Mubarak's three-decade rule.


World The Egyptian protests against lack of work,

CNT-APP10-45182 - Friday, 2/8/2013 4:04pm - Unknown


Figure  7.  Example  tweets  resulting  from  a  search  using  the  Protest  Concept  filter  and  the  Degree 
Role  filter. 
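
To put the filter counts reported above in proportion, the short Python calculation below works out what share of the corpus each filter leaves and how many 25-tweet result pages that implies. The tweet counts and page size are taken from the text; the helper names share and pages are introduced here for illustration only.

```python
import math

TOTAL_TWEETS = 7_397_115   # tweets imported into Contour
PAGE_SIZE = 25             # tweets shown per results page

FILTER_COUNTS = {
    "Protest": 73_328,
    "Protest + Place": 7_730,
    "Protest + Degree": 5_860,
}


def share(count, base):
    """Percentage of `base` represented by `count`."""
    return 100.0 * count / base


def pages(count, page_size=PAGE_SIZE):
    """Number of result pages needed to display `count` tweets."""
    return math.ceil(count / page_size)


if __name__ == "__main__":
    protest = FILTER_COUNTS["Protest"]
    for label, count in FILTER_COUNTS.items():
        print(
            f"{label:<18} {count:>7,} tweets  "
            f"{share(count, TOTAL_TWEETS):.3f}% of corpus  "
            f"{share(count, protest):.2f}% of Protest  "
            f"{pages(count):,} pages"
        )
```

The Protest Concept alone keeps roughly 1% of the corpus, and adding the Place or Degree Role keeps about 10.5% and 8% of that subset respectively, cutting the analyst's reading load from 2,934 result pages to 310 or 235.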

The  resulting  tweets  also  reference  violence  and  possible  communication  disruptions.  To  further 
investigate  the  level  of  violent  activities  associated  with  the  protests,  the  analyst  selects  a  Keyword  filter 
of  “violence”  combined  with  the  Protest  Concept  filtering  (Figure  8).  The  Role  filters  associated  with  the 
Protest  Concept  do  not  include  “violence”  or  anything  similar;  therefore,  a  Keyword  filter  must  be  used. 
The  word  “violence”  is  typed  into  the  Keyword  text  box.  Figure  9  shows  several  tweets  resulting  from  the 
Protest  Concept  and  violence  Keyword  filters. 
Figure 8. The Concept filtering widget with the Protest frame selected (top) and the Keyword filter
“violence”  typed  into  the  Keywords  widget  (bottom). 


Friday, February 4, 2011 Egyptian Protests--Departing from Cairo, Accepting Naivety (Protest)

CNT-APP5-29488 - Friday, 2/8/2013 2:38pm - Unknown

The pro-government supporters, who have been in the street over the past few days, bringing with them the horrific violence that has consumed Tahrir
Square are largely not real protesters, but hired vigilantes and members of the ruling party and the polic


Published on Wednesday, February 9, 2011 by CommonDreams.org (Protest)

CNT-APP6-39920 - Friday, 2/8/2013 2:38pm - Unknown

Palestinian Authority/Hamas The Palestinian Authority's police used violence against peaceful demonstrators during a rally in Ramallah on February 2,
2011, to support the protesters in Egypt,


Figure  9.  Example  tweets  resulting  from  a  search  using  the  Protest  Concept  filter  combined  with  a 
“violence”  Keyword  filter. 

The  tweets  returned  by  the  Protest  Concept  and  violence  Keyword  search  indicate  the  protest  may  have 
started  peacefully  but  now  has  turned  violent  with  looting  and  damage  to  private  and  public  property.  The 
Egyptian  government  is  utilizing  equipment  and  tactics  to  control  this  violence  and  disperse  the 
protesters.  This  will  further  disrupt  traffic  and  impact  any  ground  patrol  units  planning  on  entering  the 
area. 
In addition to restricted ground travel, the analyst is concerned about communications limitations in the
Cairo  area  because  the  tweet  content  from  a  previous  search  mentioned  “a  shutdown  of  communications” 
(Figure  7).  Again,  the  analyst  uses  a  Keyword  filter  in  conjunction  with  the  Protest  Concept  filter.  Figure 
10  shows  some  example  tweets  returned  from  a  search  using  the  Keyword  “communication”.  These 
tweets  inform  the  analyst  the  government  has  shut  down  a  major  internet  provider  and  cell  phone 
networks  in  the  hopes  of  dispersing  the  protestors.  Also,  Egyptian  government  soldiers  appear  to  be 
communicating  by  CB  radio.  The  analyst  believes  the  information  about  the  communications  blackout 
should  be  sent  up  the  chain  of  command. 

The  tweet  mentioning  the  use  of  CB  radios  by  “Thugs”  implies  government  soldiers  are  “attacking 
protesters.”  Additional  information  is  needed  to  determine  the  level  of  violence  against  civilians.  In  this 
case, the analyst uses the Semantic Search box for a more targeted search approach. The Concept Killing
and the Role Victim are selected from dropdown lists populated by the FrameNet ontology of frames and
associated frame elements. When using the Semantic Search box, a Keyword is required; wildcard
characters are optional. The Keyword of *Cairo* is entered because the analyst is interested in killings
linked to the protest activities occurring in Cairo. Figure 11 shows the search criteria (Concept, Role, and
Keyword)  entered  in  the  Semantic  Search  box.  Figure  12  displays  some  of  the  tweets  returned  by  this 
search  where  the  Killing  Concept  identified  sentences  containing  words  such  as  shooting,  killed,  and 
assassination  (blue  and  yellow  highlighted  text).  The  green  highlighted  text  corresponds  to  the  use  of  the 
*Cairo*  keyword. 


Egyptian government tightens grip ahead of protests (Protest)

CNT-APP10-4202 - Wednesday, 2/6/2013 11:48 am - Unknown

Egyptian government tightens grip ahead of protests By the CNN Wire Staff Gunfire amid protests in Cairo STORY HIGHLIGHTS Major Internet provider shut down
Mobile phone networks to be shut down Trains ordered stopped indefinitely Internet (CNN) -- The emb


Figure  10.  Example  tweets  resulting  from  a  search  using  the  Protest  Concept  filter  combined  with 
a  “communication”  Keyword  filter. 
Figure 11. The Semantic Search box with the Killing Concept, Victim Role, and the Keyword
*Cairo* entered as search criteria.


Egypt Tuesday LiveBlog: Will a Million March? 0530 (Killing)

CNT-APP10-45076 - Friday, 2/8/2013 3:29 pm - Unknown  Victim

Hillary Clinton in a call to Egyptian Vice President described the shooting on democracy protesters in Tahrir Square in Cairo as 'shocking'.


Mubarak supporters battle with Egypt opposition protesters (Killing)

CNT-APP7-9347 - Wednesday, 2/6/2013 12:28pm - Unknown  Victim Victim

6:59 PM, February 2, 2011 CAIRO - At least three people were killed Wednesday and hundreds more wounded on a day of bloodshed in Cairo that saw violent
clashes between supporters and protesters of Egyptian President Hosni Mubarak's crumbling regime.


Egypt News.Net Sunday 6th February, 2011 Protester (Killing)

CNT-APP9-G194 - Wednesday, 2/6/2013 12:08pm - Unknown  Killer Victim

Tahrir Square continued to be the hub of the protests against Mubarak, who has ruled Egypt continuously ever since he was sworn in in 1981 following the
assassination of then president Anwar Sadat at a military parade in Cairo.


One person killed, m (Killing)

CNT-APP10-24867 - Thursday, 2/7/2013 4:21pm - English  Victim

One person killed, more than 400 wounded Wednesday in clashes in Cairo, a government official says.


Opponents and West p (Killing)

CNT-APP9-22935 - Thursday, 2/7/2013 4:22pm - English  Victim

Opponents and West press Mubarak after 6 killed CAIRO (Reuters) - Gunmen fired on anti-government protesters in Cairo, w http://tiny.ly/rZDJ


“@monaeltahawy: (Killing)

CNT-APP2-81126 - Monday, 2/18/2013 11:40am - English  Killer Victim

Snipers kill three Egyptian uprisers in downtown Cairo, according to @rassdwehda #Egypt #25Jan” war against his own people!


Figure  12.  Example  tweets  resulting  from  using  the  Semantic  Search  box  with  search  criteria 
Killing  Concept,  Victim  Role  and  the  Keyword  *Cairo*. 
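
The query shown in Figures 11 and 12 can be pictured as a filter over messages that have already been annotated with frames. The following Python fragment is a minimal sketch under stated assumptions, not Contour's implementation: the `Frame` and `Message` classes, the `semantic_search` function, and the choice to match the Keyword against the whole message text are all hypothetical. The first two sample messages are abridged from Figure 12; the last two are invented for contrast.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Frame:
    """One Concept (frame) evoked by a message, with its Roles (frame elements)."""
    concept: str
    roles: dict = field(default_factory=dict)  # Role name -> text filling it


@dataclass
class Message:
    text: str
    frames: list = field(default_factory=list)


def semantic_search(messages, concept, role, keyword):
    """Return messages that evoke `concept`, fill `role`, and match `keyword`.

    `keyword` may contain * wildcards. Matching is case-insensitive and is
    applied to the whole message text (an assumption of this sketch).
    """
    pattern = keyword.lower()
    hits = []
    for msg in messages:
        has_frame = any(
            f.concept == concept and role in f.roles for f in msg.frames
        )
        if has_frame and fnmatchcase(msg.text.lower(), pattern):
            hits.append(msg)
    return hits


messages = [
    # Abridged from Figure 12.
    Message("One person killed, more than 400 wounded Wednesday in clashes in Cairo",
            [Frame("Killing", {"Victim": "One person"})]),
    Message("Snipers kill three Egyptian uprisers in downtown Cairo",
            [Frame("Killing", {"Killer": "Snipers",
                               "Victim": "three Egyptian uprisers"})]),
    # Invented examples that should not match the Cairo query.
    Message("Protesters gather in Cairo ahead of the march",
            [Frame("Protest", {"Protester": "Protesters"})]),
    Message("Two people killed in clashes in Alexandria",
            [Frame("Killing", {"Victim": "Two people"})]),
]

hits = semantic_search(messages, "Killing", "Victim", "*Cairo*")
for msg in hits:
    print(msg.text)
```

Keeping the Concept and Role test separate from the Keyword test mirrors the interface described above, where the dropdown selections choose the frame and frame element and the Keyword narrows the result set further.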


This  use  case  demonstrated  how  an  analyst  could  collect  low-grade  intelligence  information  from  a  large 
stream  of  social  media  data  using  Contour’s  semantic  search  interface.  With  the  frame-based  filtering 
widgets  (Concept  and  Role)  and  an  optional  keyword  to  further  narrow  the  search,  the  analyst  was  able  to 
quickly  and  concisely  construct  a  series  of  queries  using  a  mouse  and  dropdown  menus.  These  queries 
were used to iteratively search through the content of over seven million tweets for relevant bits of
information  pertaining  to  events  occurring  within  an  area  of  operations. 

The  low-grade  information  extracted  from  the  tweets,  when  added  together,  forms  an  operational  picture 
of widespread protests in the area of Tahrir Square, Cairo. The protests, which began as peaceful
demonstrations against Mubarak and his government, rapidly escalated to violent clashes with the
Egyptian military. The analyst learned that transportation routes were blocked and communication services
shut down. The Egyptian military, communicating via CB radios and using tanks, was firing on protesters,
resulting in many civilian casualties. This information can complement high-grade intelligence supporting
command  level  decision  makers  in  carrying  out  ongoing  missions  and  mitigating  emergent  threats. 

Discussion 

The use case in the previous section, although simplistic, shows how an analyst could search a
large stream of tweets for low-grade information that, when added together, produces actionable
intelligence within a collection-resource-deprived operational environment. Two of the most important
concerns  of  the  analyst  would  be  to  gather  information  about  what  events  are  occurring  or  may  occur 
within  a  specific  geographic  region  such  as  an  area  of  operations  or  forward  operating  base,  and  how 
these  events  might  affect  missions  such  as  ground  patrol  units  or  forward  communications.  The  use  case 
simulated  the  construction  of  several  search  queries  at  different  levels  of  complexity  to  extract  useful 
information  from  seven  million  tweets  reporting  the  rapidly  deteriorating  situation  in  Tahrir  Square. 

Contour’s  widget-based  search  filtering  system  offers  many  advantages.  Queries  at  three  different  levels 
of  complexity  can  be  constructed  using  a  mouse  with  minimal  text  input.  Query  construction  is  intuitive 
with  dropdown  lists  displaying  the  available  FrameNet  Concepts  (frames)  and  associated  Roles  (frame 
elements)  for  mouse-click  selection,  and  an  optional  Keyword  text  entry  widget  to  narrow  the  query  focus 
further.  In  addition,  the  designed  simplicity  of  the  main  window  of  the  user  interface  lends  itself  to 
mobile computing devices where display screen size is restricted. Other capabilities of the semantic
search interface not covered in the use case include pre-defined date range filters and a text
box for more traditional manual query entry using Concepts, Roles, and Keywords, including wildcard
characters and logical operators.

There are many challenges and stresses associated with the collection and analysis of high-grade
intelligence; for example, a network may lack the security needed to access classified reports, or
conditions may be too unsafe to observe and document potentially threatening activities. In these types
of environments, Contour’s point-and-click query construction enables inexperienced analysts to collect
low-grade intelligence information from easily accessible social media sources in support of a common
operating picture.

Contour’s semantic search technology is in the prototype stage of development. There are two areas for
improvement: extending the FrameNet ontology and integrating real-time processing of social media
content. Contour’s Concepts and Roles widgets are populated by the FrameNet ontology. The FrameNet
project  represents  the  culmination  of  decades’  worth  of  research  on  how  to  depict  knowledge  for  language 
processing  systems.  FrameNet,  primarily  funded  by  the  National  Science  Foundation  and  housed  at  the 
International  Computer  Science  Institute  in  Berkeley,  California,  was  designed  as  a  general  ontology  to 
model  the  meaning  of  text.  Intelligence  analysts  draw  upon  data  from  many  different  types  of  sources 
including  command  and  control  and  mission  support;  therefore,  ontologies  for  semantic  search  systems 
must  be  able  to  accommodate  a  wide  variety  of  Concepts  and  Roles.  Ontology  gaps  can  become 
problematic  when  attempting  to  analyze  text  within  dynamic  military  environments.  Because  intelligence 
analysts  frequently  need  to  understand  and  adapt  to  new  situations,  methods  of  automatically  expanding 
FrameNet’s  ontology  would  improve  the  future  capabilities  of  Contour’s  semantic  search  engine. 
Currently, social media collection and pre-processing and the semantic search analysis form a two-step
process utilizing different technology platforms. An external collection mechanism is used to capture
streams of social media content such as tweets. Then the captured text messages are pre-processed (i.e.,
SLR), and imported into Contour’s semantic search engine. One improvement that would better exploit
the dynamic nature of social media is to ingest, pre-process, and analyze the message streams for
semantic search in one seamless process. Identifying events as they occur or even anticipating what kind of event will
happen,  and  where,  based  on  social  media  user-contributed  messages,  would  be  a  valuable  intelligence 
collection  asset  and  a  future  area  of  development  for  Contour. 
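
The single-pass ingestion described above can be sketched as follows. This is an illustrative sketch only, and every name in it (`TRIGGERS`, `annotate`, `ingest`) is hypothetical. The `annotate` function is a toy stand-in for the real pre-processing step: it looks up the three Killing trigger words named in the use case rather than performing any actual semantic analysis. The sample messages are abridged from Figures 10 and 12.

```python
from collections import defaultdict

# Toy trigger lexicon: the three Killing words come from the use case;
# a real system would run a full pre-processing step, not a word list.
TRIGGERS = {"shooting": "Killing", "killed": "Killing", "assassination": "Killing"}


def annotate(text):
    """Stand-in for pre-processing: return the set of Concepts evoked by text."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return {TRIGGERS[w] for w in words if w in TRIGGERS}


def ingest(stream, index):
    """Annotate and index each message as it arrives, then pass it along."""
    for text in stream:
        for concept in annotate(text):
            index[concept].append(text)
        yield text  # callers can react to each message immediately


index = defaultdict(list)  # Concept name -> messages evoking it
stream = iter([
    "At least three people were killed Wednesday in Cairo.",
    "Trains ordered stopped indefinitely.",
    "Clinton described the shooting in Tahrir Square as shocking.",
])
seen = list(ingest(stream, index))
print(len(seen), "messages ingested;", len(index["Killing"]), "indexed under Killing")
```

Because `ingest` is a generator, each message becomes searchable as soon as it is read, with no separate batch import between collection and search.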

Conclusion 

Social media users produce millions of messages a day, sometimes live-tweeting about critical events such
as natural disasters or political transitions. While much of the content of social media is unstructured and
non-informative, advanced analytic technologies such as Contour’s FrameNet-based semantic search
engine can identify small bits of low-grade intelligence information present within social media streams.
This low-grade information, when added together, can produce actionable intelligence within a
collection-resource-deprived operational environment. Contour’s semantic search process maps words and phrases
in  social  media  content  to  FrameNet  frames  and  frame  elements  using  Concept  and  Role  filtering  widgets. 
This  widget  interface  allows  three  levels  of  complexity  in  query  formulation  using  intuitive  dropdown 
menus  and  optional  keyword  text  entry. 

A  use  case  demonstrated  Contour’s  primary  semantic  search  capabilities  in  the  context  of  an  analyst 
tasked  with  gathering  information  about  events  occurring  within  a  specific  geographic  region  such  as  a 
forward  operating  base.  The  use  case  simulated  the  construction  of  several  search  queries  at  different 
levels  of  complexity  to  extract  useful  information  from  over  seven  million  tweets  reporting  the  rapidly 
deteriorating  security  situation  in  Tahrir  Square  during  the  Egypt  uprisings. 